Morphology Without Borders: Clause-Level Morphology

Authors

Abstract

Morphological tasks use large multi-lingual datasets that organize words into inflection tables, which then serve as training and evaluation data for various tasks. However, a closer inspection of these data reveals profound cross-linguistic inconsistencies, which arise from the lack of a clear linguistic and operational definition of what is a word, and which severely impair the universality of the derived tasks. To overcome this deficiency, we propose to view morphology as a clause-level phenomenon, rather than word-level. It is anchored in a fixed yet inclusive set of features that encapsulates all functions realized in a saturated clause. We deliver MightyMorph, a novel dataset for clause-level morphology covering 4 typologically different languages: English, German, Turkish, and Hebrew. We use this dataset to derive 3 clause-level morphological tasks: inflection, reinflection, and analysis. Our experiments show that the clause-level tasks are substantially harder than the respective word-level tasks, while having comparable complexity across languages. Furthermore, redefining morphology to the clause level provides a neat interface with contextualized language models (LMs) and allows assessing the morphological knowledge encoded in these models and their usability for morphological tasks. Taken together, this work opens up new horizons in the study of computational morphology, leaving ample space for studying neural morphology cross-linguistically.
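To make the three tasks named in the abstract concrete, here is a minimal Python sketch of their input/output shapes, using a tiny hand-written lookup table. The feature labels, the table contents, and the function names `inflect`, `analyze`, and `reinflect` are assumptions made for this example only; they are not MightyMorph's actual schema or data, and a real system would learn these mappings rather than look them up.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Features = FrozenSet[str]

# Toy clause-level "inflection table" for the English lemma "walk".
# The feature labels are illustrative placeholders, NOT the MightyMorph schema.
TABLE: Dict[Tuple[str, Features], str] = {
    ("walk", frozenset({"IND", "PRS", "NOM(3,SG,MASC)"})): "he walks",
    ("walk", frozenset({"IND", "PST", "NOM(3,SG,MASC)"})): "he walked",
    ("walk", frozenset({"IND", "FUT", "NOM(1,PL)", "NEG"})): "we will not walk",
}


def inflect(lemma: str, features: Iterable[str]) -> str:
    """Inflection: lemma + target features -> clause."""
    return TABLE[(lemma, frozenset(features))]


def analyze(form: str) -> Tuple[str, Features]:
    """Analysis: clause -> lemma + features."""
    for (lemma, feats), candidate in TABLE.items():
        if candidate == form:
            return lemma, feats
    raise KeyError(form)


def reinflect(source_form: str, target_features: Iterable[str]) -> str:
    """Reinflection: an inflected clause + target features -> another clause."""
    lemma, _ = analyze(source_form)
    return inflect(lemma, target_features)


if __name__ == "__main__":
    print(inflect("walk", {"IND", "PST", "NOM(3,SG,MASC)"}))
    print(analyze("we will not walk"))
    print(reinflect("he walks", {"IND", "FUT", "NOM(1,PL)", "NEG"}))
```

The point of the sketch is only that, at the clause level, the "form" side of each task is a multi-word clause rather than a single word, while the feature bundle carries information such as negation and subject agreement.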

Related resources

One-Level Prosodic Morphology

Recent developments in theoretical linguistics have led to a widespread acceptance of constraint-based analyses of prosodic morphology phenomena such as truncation, infixation, floating morphemes and reduplication. Of these, reduplication is particularly challenging for state-of-the-art computational morphology, since it involves copying of some part of a phonological string. In this paper I a...

Two-Level Morphology with Composition

(1) Lexical representations tend to be arbitrary. Because it is difficult to write and test two-level systems that map between pairs of radically dissimilar forms, lexical representations in existing two-level analyzers tend to stay close to the surface forms. This is not a problem for morphologically simple languages like English because, for most words, inflected forms are very similar to the...

Unsupervised Learning of Morphology Without Morphemes

The first morphological learner based upon the theory of Whole Word Morphology (Ford et al., 1997) is outlined, and preliminary evaluation results are presented. The program, Whole Word Morphologizer, takes a POS-tagged lexicon as input, induces morphological relationships without attempting to discover or identify morphemes, and is then able to generate new words beyond the learning sample. Th...

Galaxy Morphology without Classification: Self Organizing Maps

We examine a general framework for visualizing datasets of high (> 2) dimensionality, and demonstrate it using the morphology of galaxies at moderate redshifts. The distributions of various populations of such galaxies are examined in a space spanned by four purely morphological parameters. Galaxy images are taken from the Hubble Space Telescope (HST) Wide Field Planetary Camera 2 (WFPC2) in th...

A Two-level Morphology of Malagasy

We present a two-level model of Malagasy nominal and verbal morphology (Beesley and Karttunen, 2003), based primarily on the discussion of Malagasy morphology in Keenan and Polinsky (1998) and Randriamasimanana (1986). Words in Malagasy are built from roots by means of a variety of morphological operations such as affixation and reduplication. The present paper analyzes productive patterns of n...

Journal

Journal title: Transactions of the Association for Computational Linguistics

Year: 2022

ISSN: 2307-387X

DOI: https://doi.org/10.1162/tacl_a_00528